Abstract

For a map of the unit interval with an indifferent fixed point, we 
prove an upper bound for the variance of all observables of n variables 
$K : [0, 1]^n \to \mathbb{R}$ which are componentwise Lipschitz. The proof is
based on coupling and decay of correlation properties of the map. We 
then give various applications of this inequality to the almost-sure 
central limit theorem, the kernel density estimation, the empirical 
measure and the periodogram. 

key-words: variance, componentwise Lipschitz observable, almost- 
sure central limit theorem, kernel density estimation, empirical mea- 
sure, periodogram, shadowing, Kantorovich-Rubinstein theorem. 
1 Introduction

Nowadays, concentration inequalities are a fundamental tool in probability 
theory and statistics. We refer the reader to, e.g., [Ill [HI [lEl [TTl EO] . In par- 
ticular, they also turn out to be essential tools to develop a non-asymptotic 
theory in statistics, exactly as the central limit theorem and large devia- 
tions are known to play a central part in the asymptotic theory. Besides 
the non-asymptotic aspect of concentration inequalities, the crucial point is 
that they allow in principle to study random variables $Z_n = K(X_1, \ldots, X_n)$
that "smoothly" depend on the underlying random variables $X_i$, but otherwise
can be defined in an indirect or a complicated way, and for which
explicit computations can be very hard, even in the case where the $X_i$'s are
independent.

In the context of dynamical systems, central limit theorems and their
refinements, large deviations, and other types of limit theorems have been
proved, almost exclusively for Birkhoff sums of sufficiently "smooth" observables.
But many natural observables are not Birkhoff sums. Let us just mention
a typical example (see below for more examples), namely the so-called
power spectrum, that is, the Fourier transform of the correlation function, 
whose estimator is the integral of the periodogram. This is a very compli- 
cated quantity from the analytic point of view. Besides the computational 
difficulties proper to each observable, one would like to have a systematic 
method to approach the questions of fluctuations of observables, instead of 
designing a particular method for each case. 

One possible method is to use concentration inequalities. An additional difficulty
comes in for dynamical systems, namely the fact that we lose independence,
except in very special cases, and that the mixing properties of dynamical 
systems are not as nice as for stochastic processes encountered usually in 
probability theory, such as Markov chains, renewal processes, etc. So new 
approaches have to be proposed, based on typical tools of dynamical sys- 
tems like the spectral gap (when it exists) of the transfer operator and the 
decay of correlations. The first concentration inequality in this context was 
obtained by Collet et al. [9] for uniformly expanding maps of the interval,
without assuming the existence of a Markov partition. They obtained the 
so-called Gaussian concentration inequality (also called exponential concen- 
tration inequality) by bounding the exponential moment of any observable 
of n variables only assuming that it is componentwise Lipschitz. They de- 
duced several applications (kernel density estimation, shadowing, etc). In 
the hope of proving concentration inequalities for more general dynamical 
systems, one can start with an inequality for the variance, leading to a poly- 
nomial concentration inequality. This was indeed done in [5] for a large class 
of non-uniformly hyperbolic systems modeled by a "Young tower with exponential
return times" [21]. In [6], the authors of [5] showed the usefulness of
this variance inequality (therein called "Devroye inequality") through vari- 
ous examples. Let us also mention another approach based on coupling [H E] 
that gives, e.g., an alternative proof of the Gaussian deviation inequality in
the case of uniformly expanding maps of the interval, and also used in the 
context of Gibbs random fields. 

Regarding Birkhoff sums of "smooth" observables (e.g., Hölder), central
limit theorems and large deviation estimates have been proved both for systems
modeled by Young towers with the exponential return-time tail mentioned
above and for those with a summable return-time tail, see, e.g., [21], [22], [18], [19].
So, a natural question is to try to prove an inequality for the variance of
any observable of $n$ variables only assuming it is componentwise Lipschitz,
as in [3] , but relaxing the exponential decay of the return-time tail of Young
towers [52]. This would give a way to analyze fluctuations of complicated
observables, which are not Birkhoff sums. The simplest and classical example
is a map of the unit interval with an indifferent fixed point. In this
paper, we prove a variance inequality for the map $T(x) = x + 2^{\alpha}x^{1+\alpha}$ when
$x \in [0, 1/2[$ and strictly expanding on $[1/2, 1]$, when $\alpha$ is small enough (Theorem 3.1).
However, the proof applies verbatim to the class of maps with a
unique indifferent fixed point considered in [13]. The major difference with
the situation in pi [9] is that the transfer operator has no spectral gap and 
that the decay of correlations is polynomial instead of being exponential. 
Therefore we develop a different approach based on decay of correlations. 
We need to control the covariance of $C^0$ functions and Lipschitz functions,
which is done by H. Hu [13]. An important ingredient is coupling through the
Kantorovich-Rubinstein duality theorem. At present, we are not able to construct
explicitly a coupling for the backward process as the one constructed
in [T] for uniformly expanding maps of the interval. This explicit coupling 
was used in [H] in order to prove the Gaussian concentration inequality. After 
proving the variance inequality, we show various applications of it, namely, 
to the almost-sure central limit theorem, the kernel density estimation, the 
empirical measure, the integrated periodogram and the shadowing. 

The paper is organized as follows. Section 2 contains the necessary information
on the maps, while Section 3 contains our main result, namely the
variance inequality. In Section 4 we give various applications of it. Section 5
contains the proof of the Devroye inequality.

2 The map and its properties 

2.1 The map and the invariant measure 

For the sake of definiteness, we consider the maps $T : [0, 1] \to [0, 1]$ such that on
$[0, 1/2[$

\[ T(x) = x + 2^{\alpha} x^{1+\alpha} \]

and such that $|T'(x)| > 1$ and $|T''(x)| < \infty$ on $[1/2, 1]$. In fact, all that follows
is valid under the assumptions of H. Hu [13].
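To make the definition concrete, here is a minimal Python sketch of such a map. The left branch is the formula given in the text. The right branch $T(x) = 2x - 1$ on $[1/2, 1]$ is only an assumption made for illustration (one simple choice that is expanding with bounded second derivative); the function names `intermittent_map` and `orbit` are likewise hypothetical.

```python
def intermittent_map(x: float, alpha: float) -> float:
    """One iteration of a map of [0, 1] with an indifferent fixed point at 0.

    Left branch (from the text):  T(x) = x + 2**alpha * x**(1 + alpha) on [0, 1/2[.
    Right branch (ASSUMPTION):    T(x) = 2x - 1 on [1/2, 1].
    """
    if not 0.0 <= x <= 1.0:
        raise ValueError("x must lie in [0, 1]")
    if x < 0.5:
        # min(...) guards against a rounding overshoot just below x = 1/2.
        return min(1.0, x + (2.0 ** alpha) * x ** (1.0 + alpha))
    return 2.0 * x - 1.0


def orbit(x0: float, alpha: float, n: int) -> list:
    """Return the orbit segment [x0, T(x0), ..., T^(n-1)(x0)]."""
    points = []
    current = x0
    for _ in range(n):
        points.append(current)
        current = intermittent_map(current, alpha)
    return points
```

With this choice the left branch maps $[0, 1/2[$ onto $[0, 1[$, since $1/2 + 2^{\alpha}(1/2)^{1+\alpha} = 1$.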

For $\alpha \in [0, 1[$, this map admits an absolutely continuous invariant probability
measure $d\mu(x) = h(x)\,dx$, where $h(x) \sim x^{-\alpha}$ when $x$ tends to 0.

We define the sequence of points $x_\ell$ by $x_0 = 1$, $x_1 = 1/2$ and, for $\ell \ge 2$,
$T(x_\ell) = x_{\ell-1}$ and $x_\ell < 1/2$. It is easy to verify that the sequence of intervals

\[ I_\ell := ]x_{\ell+1}, x_\ell], \]

for $\ell = 0, 1, 2, \ldots$, is a Markov partition of the interval $]0, 1]$.
We have the behavior, see e.g. [13],

\[ |I_\ell| \sim \ell^{-1/\alpha - 1}, \qquad x_\ell \sim \ell^{-1/\alpha}. \tag{1} \]
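The endpoints of this partition can be computed numerically. The sketch below solves $T(x_\ell) = x_{\ell-1}$ on the left branch by bisection; it uses only the left-branch formula from the text, and the names `left_branch` and `partition_points` are hypothetical. It is meant for $\alpha > 0$, where the power-law behavior applies.

```python
def left_branch(x: float, alpha: float) -> float:
    """Left branch of the map: x + 2**alpha * x**(1 + alpha), increasing on [0, 1/2]."""
    return x + (2.0 ** alpha) * x ** (1.0 + alpha)


def partition_points(alpha: float, count: int) -> list:
    """Return [x_0, ..., x_count] with x_0 = 1, x_1 = 1/2 and T(x_l) = x_(l-1), x_l < 1/2."""
    points = [1.0, 0.5]
    for _ in range(2, count + 1):
        target = points[-1]
        lo, hi = 0.0, 0.5
        # Bisection: left_branch is increasing, so the preimage is unique.
        for _ in range(200):
            mid = 0.5 * (lo + hi)
            if left_branch(mid, alpha) < target:
                lo = mid
            else:
                hi = mid
        points.append(0.5 * (lo + hi))
    return points[: count + 1]
```

Comparing `partition_points(alpha, L)[L]` for two large values of `L` gives a rough numerical check of the power-law decay of the $x_\ell$.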
2.2 Decay of correlations

The covariance or correlation coefficient $\mathrm{Cov}_{v,w}(\ell)$ of two $L^2(\mu)$ functions
$v, w : [0, 1] \to \mathbb{R}$ is defined, as usual, by

\[ \mathrm{Cov}_{v,w}(\ell) = \int v \circ T^{\ell}\, w \, d\mu - \int v \, d\mu \int w \, d\mu. \]

When $v = w$, we simply write $\mathrm{Cov}_v$.

Various people established the (optimal) decay of correlations for the map
$T$, namely $\mathrm{Cov}_{u,v}(\ell) = O(\ell^{-1/\alpha+1})$. In, e.g., [22] this is proved for $u, v$ both being
Hölder. As it will turn out, we need the following estimate proved in [13].
There exists a constant $C > 0$ such that, for all $v \in C^0$ and $w$ Lipschitz, we
have the following decay:

\[ |\mathrm{Cov}_{v,w}(\ell)| \le C \|v\|_{C^0} \, \mathrm{Lip}(w) \, \gamma_\ell, \tag{2} \]

where

\[ \gamma_\ell := \ell^{-1/\alpha+1} \tag{3} \]

and where

\[ \mathrm{Lip}(w) = \sup_{x \ne x'} \frac{|w(x) - w(x')|}{|x - x'|}. \]

This follows from [13]. The fact that $C$ does not depend on $v, w$ is a
consequence of Theorem B.1 in [6].
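The polynomial rate in (3) is summable exactly when $1/\alpha - 1 > 1$, that is, when $\alpha < 1/2$. The following sketch, reading (3) as $\gamma_\ell = \ell^{-1/\alpha+1}$, compares partial sums with the elementary integral-test bound $1 + 1/(s-1)$ for $s = 1/\alpha - 1$. The function names are hypothetical and the bound is a standard fact, not taken from the text.

```python
def gamma(ell: int, alpha: float) -> float:
    """Rate gamma_ell = ell**(1 - 1/alpha), for ell >= 1 and 0 < alpha < 1."""
    return float(ell) ** (1.0 - 1.0 / alpha)


def gamma_partial_sum(alpha: float, terms: int) -> float:
    """Partial sum of gamma_ell for ell = 1, ..., terms."""
    return sum(gamma(ell, alpha) for ell in range(1, terms + 1))


def gamma_sum_upper_bound(alpha: float) -> float:
    """Integral-test bound on the full series, valid only when alpha < 1/2."""
    s = 1.0 / alpha - 1.0
    if s <= 1.0:
        raise ValueError("the series diverges for alpha >= 1/2")
    return 1.0 + 1.0 / (s - 1.0)
```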

2.3 Central limit theorem 

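Using [18, Proposition 5.2] and [15], we have a central limit theorem for
Lipschitz observables when $0 < \alpha < 1/2$: for any $v$ Lipschitz which is not of
the form $h - h \circ T$ and such that $\int v \, d\mu = 0$, we have

\[ \mu\Big( \frac{1}{\sqrt{n}} \sum_{k=0}^{n-1} v \circ T^k \le t \Big) \underset{n \to \infty}{\longrightarrow} \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{t} e^{-\xi^2/(2\sigma^2)} \, d\xi, \quad \forall t \in \mathbb{R}, \tag{4} \]

where

\[ \sigma^2 = \mathrm{Cov}_v(0) + 2 \sum_{\ell=1}^{\infty} \mathrm{Cov}_v(\ell) > 0. \]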
3 Variance inequality



Our main theorem is an upper-bound for the variance of any componentwise 
Lipschitz function. 

We introduce the convenient notations 

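\[ T_p^q(x) = (T^p(x), T^{p+1}(x), \ldots, T^q(x)) \quad \text{and} \quad z_p^q = (z_p, z_{p+1}, \ldots, z_q) \]

for $0 \le p \le q$. With this notation, if we take a function $K$ of $n$ variables, we
write, e.g., $K(z_1^{j-1}, z_j, z_{j+1}^n)$ for $K(z_1, z_2, \ldots, z_n)$.

A real-valued function $K$ on $[0, 1]^n$ is said to be componentwise Lipschitz
if, for all $1 \le j \le n$, the following quantities are finite:

\[ \mathrm{Lip}_j(K) := \sup_{z_1, z_2, \ldots, z_j, z_{j+1}, \ldots, z_n} \; \sup_{z_j \ne \hat z_j} \frac{|K(z_1^{j-1}, z_j, z_{j+1}^n) - K(z_1^{j-1}, \hat z_j, z_{j+1}^n)|}{|z_j - \hat z_j|}. \]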
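As an illustration of the componentwise Lipschitz constants, the sketch below estimates $\mathrm{Lip}_j(K)$ from below by random sampling: it changes only the $j$-th coordinate and records the largest difference quotient seen. This is a numerical heuristic, not part of the paper; the name `estimate_lip_j`, the zero-based index `j` and the sampling scheme are assumptions.

```python
import random


def estimate_lip_j(K, n: int, j: int, trials: int = 2000, seed: int = 0) -> float:
    """Monte Carlo lower estimate of Lip_j(K) for K defined on [0, 1]^n.

    K takes a list of n floats; j is a zero-based coordinate index (ASSUMPTION).
    """
    rng = random.Random(seed)
    best = 0.0
    for _ in range(trials):
        z = [rng.random() for _ in range(n)]
        w = list(z)
        w[j] = rng.random()          # perturb only the j-th coordinate
        gap = abs(z[j] - w[j])
        if gap > 0.0:
            best = max(best, abs(K(z) - K(w)) / gap)
    return best
```

For the Birkhoff average $K(z_1, \ldots, z_n) = \frac{1}{n}\sum_i z_i$ every quotient equals $1/n$, so the estimate returns $1/n$ up to rounding.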

Our main theorem reads as follows. 

Theorem 3.1. Let $T$ be the map defined in Section 2. Then, for any $\alpha \in
[0, 4 - \sqrt{15}[$, there exists $D = D(\alpha) > 0$ such that, for any componentwise
Lipschitz function $K : [0, 1]^n \to \mathbb{R}$, we have

\[ \int \Big( K(T_0^{n-1}(x)) - \int K(T_0^{n-1}(y)) \, d\mu(y) \Big)^2 d\mu(x) \le D \sum_{j=1}^{n} \mathrm{Lip}_j(K)^2. \tag{5} \]

(This inequality is called "Devroye inequality" in [5, 6].)

An application of Chebychev's inequality immediately yields the following 
concentration inequality. 

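Corollary 3.2. Under the assumptions of Theorem 3.1, we have

\[ \mu \Big\{ x \in [0, 1] : \Big| K(T_0^{n-1}(x)) - \int K(T_0^{n-1}(y)) \, d\mu(y) \Big| > t \Big\} \le \frac{D \sum_{j=1}^{n} \mathrm{Lip}_j(K)^2}{t^2} \]

for all $t > 0$.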
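For completeness, the Chebyshev step can be written out as follows, where $Z$ is a shorthand introduced here for the centered observable and $D$ is the constant of the variance inequality (5):

```latex
Z(x) := K\big(T_0^{n-1}(x)\big) - \int K\big(T_0^{n-1}(y)\big)\,d\mu(y),
\qquad
\mu\{x : |Z(x)| > t\}
  \;\le\; \frac{1}{t^{2}} \int Z^{2}\,d\mu
  \;\le\; \frac{D}{t^{2}} \sum_{j=1}^{n} \mathrm{Lip}_j(K)^{2}.
```

The first inequality is Chebyshev's inequality applied to $Z$, the second is the variance inequality.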



Remark 3.3. In our context, we cannot expect a Gaussian concentration 
bound. This would give a Gaussian concentration inequality incompatible 
with large deviation lower bounds obtained in /7g| / where, for a large class of 
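Hölder observables $v$, it is proved that for $\epsilon > 0$ small enough

\[ \mu \Big\{ x \in [0, 1] : \Big| \frac{S_n v(x)}{n} \Big| > \epsilon \Big\} \]

for any $\delta > 0$ and infinitely many $n$'s, where $S_n v = v + v \circ T + \cdots + v \circ T^{n-1}$.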
This type of inequalities was also obtained in flJ^ under different conditions. 
4 Some applications



We now give some applications of the variance inequality (5). We follow [6]
where we obtained them in an abstract setting: therein we assumed that
$(X_k)$ was some real-valued, stationary, ergodic process satisfying (5), plus
possibly an extra condition on the auto-covariance of Lipschitz observables,
depending on each specific application. By (2) we have

\[ \sum_{\ell=1}^{\infty} \mathrm{Cov}_v(\ell) \le C' \, (\mathrm{Lip}(v))^2 \]

where $C' = C \sum_{\ell} \gamma_\ell < \infty$. This condition will be sufficient to apply all the
results from [6] that we will use.

The standing assumption in this section is that $0 \le \alpha < 4 - \sqrt{15}$, so that
Theorem 3.1 holds.

4.1 Almost-sure central limit theorem 

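For an observable $v$ such that $\int v \, d\mu = 0$, define the sequence of weighted
empirical (random) measures of the normalized Birkhoff sum by

\[ A_n(x) = \frac{1}{D_n} \sum_{k=1}^{n} \frac{1}{k} \, \delta_{S_k v(x)/\sqrt{k}} \]

where $D_n = \sum_{k=1}^{n} \frac{1}{k}$.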
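A minimal sketch of this weighted empirical measure, assuming the logarithmic weights $1/k$ normalized by $D_n = \sum_{k=1}^n 1/k$ and atoms at $S_k v(x)/\sqrt{k}$, with $S_k v = v + v \circ T + \cdots + v \circ T^{k-1}$. The input is the list of observed values $v(T^j x)$; the function name `asclt_measure` is hypothetical.

```python
import math


def asclt_measure(values):
    """Atoms and weights of the log-weighted empirical measure.

    values[j] is v(T^j x) for j = 0, ..., n-1.
    Returns (atoms, weights): atoms[k-1] = S_k v(x) / sqrt(k), weights[k-1] = 1 / (k * D_n).
    """
    n = len(values)
    if n == 0:
        raise ValueError("need at least one value")
    d_n = sum(1.0 / k for k in range(1, n + 1))   # normalizing constant D_n
    atoms, weights = [], []
    partial = 0.0
    for k in range(1, n + 1):
        partial += values[k - 1]                  # Birkhoff sum S_k v(x)
        atoms.append(partial / math.sqrt(k))
        weights.append(1.0 / (k * d_n))
    return atoms, weights
```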

We say that the almost-sure central limit theorem holds if for $\mu$ almost
every $x$, $A_n(x)$ converges weakly to the Gaussian measure. In fact, we will
prove a stronger statement, namely that the convergence takes place in the
Kantorovich distance.

Let us recall that the Kantorovich distance between two probability measures
$\mu_1$ and $\mu_2$ on $\mathbb{R}$ is defined by

\[ \kappa(\mu_1, \mu_2) = \sup_{g \in \mathcal{L}} \int g(\xi) \, d(\mu_1 - \mu_2)(\xi) \tag{6} \]

where $\mathcal{L}$ denotes the set of real-valued Lipschitz functions on $\mathbb{R}$ with Lipschitz
constant at most one.
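On the real line this distance is easy to compute for empirical measures. The sketch below relies on the standard fact (not stated in the text) that, for two uniform empirical measures with the same number of atoms, the Kantorovich distance equals the mean absolute difference of the sorted samples. The name `kantorovich_1d` is hypothetical.

```python
def kantorovich_1d(sample_a, sample_b) -> float:
    """Kantorovich distance between two uniform empirical measures on the real line.

    Both samples must have the same (positive) number of points; the optimal
    coupling in dimension one matches the sorted samples in order.
    """
    if len(sample_a) != len(sample_b) or len(sample_a) == 0:
        raise ValueError("samples must be non-empty and of equal size")
    a = sorted(sample_a)
    b = sorted(sample_b)
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)
```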

We denote by $\mathcal{N}(0, \sigma^2)$ the Gaussian measure with mean zero and variance
$\sigma^2$.

Theorem 4.1. Let $v$ be a Lipschitz function which is not of the form $h - h \circ T$
and assume that $\int v \, d\mu = 0$. Then, for $\mu$ almost every $x$, one has

\[ \lim_{n \to \infty} \kappa\big( A_n(x), \mathcal{N}(0, \sigma^2) \big) = 0. \]

The theorem is an immediate application of Theorem 8.1 in [6] and (4).

Notice that this theorem immediately implies that for $\mu$ almost every $x$,
$A_n(x)$ converges weakly to the Gaussian measure. The weak convergence is
proved in [7] by another method (and not only for the present intermittent
map). In p], a speed of convergence in the Kantorovich distance was obtained 
for uniformly expanding maps of the interval using a Gaussian bound. 



4.2 Kernel density estimation 

We consider the sequence of regularized (random) empirical measures $\mathcal{H}_n(x)$
with densities $(h_n)$ defined by

\[ h_n(x; s) = \frac{1}{n a_n} \sum_{j=1}^{n} \psi\big( (s - T^j(x))/a_n \big) \]

where $a_n$ is a positive sequence converging to 0 and such that $n a_n$ converges
to $+\infty$, and $\psi$ (the kernel) is a bounded, non-negative, Lipschitz continuous
function with compact support whose integral equals 1. We are interested in
the convergence in $L^1(ds)$ of this empirical density $h_n(x; \cdot)$ to the density $h(\cdot)$
of the invariant measure $d\mu(x) = h(x)\,dx$. This is nothing but the distance
in total variation between $\mathcal{H}_n(x)$ and $\mu$:

\[ \mathrm{dist}_{TV}(\mathcal{H}_n(x), \mu) = \int | h_n(x; s) - h(s) | \, ds. \]
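A minimal sketch of the kernel density estimator, assuming the usual form $\frac{1}{n a_n}\sum_j \psi((s - T^j x)/a_n)$ evaluated on a list of orbit points. The triangular kernel is one admissible choice (bounded, non-negative, Lipschitz, compactly supported, integral 1); it and the function names are assumptions made for illustration.

```python
def triangular_kernel(u: float) -> float:
    """psi(u) = max(0, 1 - |u|): Lipschitz, supported in [-1, 1], integral 1."""
    return max(0.0, 1.0 - abs(u))


def kernel_density(orbit_points, s: float, bandwidth: float, kernel=triangular_kernel) -> float:
    """Empirical density at s built from the orbit points, with bandwidth a_n."""
    n = len(orbit_points)
    if n == 0 or bandwidth <= 0.0:
        raise ValueError("need a non-empty orbit and a positive bandwidth")
    total = sum(kernel((s - p) / bandwidth) for p in orbit_points)
    return total / (n * bandwidth)
```

The total variation distance in the text would then be approximated by a numerical integral of `abs(kernel_density(...) - h(s))` over `s`, given some expression or approximation for the invariant density `h`.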

Theorem 4.2. Let $\psi$ and $a_n$ be as just described. Then, there exists a
constant $C = C(\psi) > 0$ such that for any integer $n$ and for any $t > C\big(a_n^{1-\alpha} +
1/(\sqrt{n}\, a_n^2)\big)$, we have

\[ \mu\big( \{ x \in [0, 1] : \mathrm{dist}_{TV}(\mathcal{H}_n(x), \mu) > t \} \big) \le \frac{C}{t^2 n a_n^2}. \]

This theorem is a direct consequence of Theorem 6.1 in [6] (with $\tau = 1 - \alpha$).



4.3 Empirical measure 

The empirical measure associated to $x, Tx, \ldots, T^{n-1}x$ is the random measure
on $[0, 1]$ defined by

\[ \mathcal{E}_n(x) = \frac{1}{n} \sum_{j=0}^{n-1} \delta_{T^j(x)} \]

where $\delta$ is the Dirac measure. From Birkhoff's ergodic theorem, for $\mu$ almost
every $x$ this sequence of random measures weakly converges to $\mu$. We want
to estimate the speed of this convergence with respect to the Kantorovich
distance (6) (now used for probability measures on $[0, 1]$).

Theorem 4.3. There exists a positive constant $C$ such that for all $t > 0$ and
$n \ge 1$, we have

/i Qa; G [0, 1] : //) > t + 

This is an immediate consequence of Theorem 5.2 in [6]. 




4.4 Integrated periodogram 



Let $v$ be an $L^2(\mu)$ observable and assume, for the sake of simplicity, that
$\int v \, d\mu = 0$. We recall (see, e.g., [2]) that the raw periodogram (of order $n$) of
the process $(v \circ T^k)$ is the random variable

\[ I_n^v(\omega; x) = \frac{1}{n} \Big| \sum_{j=1}^{n} e^{-ij\omega} \, v(T^j(x)) \Big|^2 \]

where $\omega \in [0, 2\pi]$. The spectral distribution function of order $n$ (integral of
the raw periodogram of order $n$) is given by

\[ J_n^v(\omega; x) = \int_0^{\omega} I_n^v(s; x) \, ds. \]
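The following sketch computes the raw periodogram and its integral for a list of observed values $v(T^j x)$, assuming the usual normalization $\frac{1}{n}\big|\sum_{j=1}^n e^{-ij\omega} v(T^j x)\big|^2$. The integral is approximated by the trapezoidal rule; the function names and the `steps` parameter are hypothetical.

```python
import cmath


def raw_periodogram(values, omega: float) -> float:
    """(1/n) * |sum_{j=1..n} exp(-i j omega) * values[j-1]|**2."""
    n = len(values)
    if n == 0:
        raise ValueError("need at least one value")
    total = sum(cmath.exp(-1j * j * omega) * v for j, v in enumerate(values, start=1))
    return abs(total) ** 2 / n


def integrated_periodogram(values, omega: float, steps: int = 256) -> float:
    """Trapezoidal approximation of the integral of the raw periodogram over [0, omega]."""
    h = omega / steps
    inner = sum(raw_periodogram(values, i * h) for i in range(1, steps))
    ends = 0.5 * (raw_periodogram(values, 0.0) + raw_periodogram(values, omega))
    return h * (ends + inner)
```

By Parseval's identity, the integral over the full period $[0, 2\pi]$ equals $\frac{2\pi}{n}\sum_j v(T^j x)^2$, which gives a convenient sanity check.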



Let C„(uj) be the Fourier cosine transform of the auto-covariance of f, 



namely 



Cv{^^) = ^^cos(ct;A;) Cov„(A; + 1) . 



fc=0 



We will denote by J^{uj) the following quantity 

r{uj) = / (2d(s) - Cov„(0)) ds = Cov„(0) + 2 V 
Jo 



sin{ujk) 



Theorem 4.4. Let $v$ be a Lipschitz observable. Then there exists a positive
constant $C = C(v)$ such that for any $n \ge 1$, one has

;i + logn)^/3 



( sup \Jl{uo;x)-r{uo)\)d^i{x)<C' 



(^e[0,27r] 



n 



2/3 



This theorem is a direct application of Theorem 3.1, and the remark just
after it, in [6].
4.5 Shadowing and mismatch

Let $A$ be a set of initial conditions with positive measure. If $x \notin A$, we can
ask how well we can approximate the orbit of $x$ by an orbit starting from an
initial condition in $A$.

We can measure the average quality of "shadowing" by the following
quantity:

\[ Z_A(x) = \frac{1}{n} \inf_{y \in A} \sum_{j=0}^{n-1} | T^j(x) - T^j(y) |. \]
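The shadowing quantity can be explored numerically by replacing the infimum over $A$ with a minimum over a finite list of candidate initial conditions in $A$, which gives an upper bound on $Z_A(x)$. In the sketch below the right branch $T(x) = 2x - 1$ of the map is an assumption made for illustration, and the function names are hypothetical.

```python
def intermittent_map(x: float, alpha: float) -> float:
    """Left branch from the text; right branch 2x - 1 is an ASSUMPTION."""
    if x < 0.5:
        return min(1.0, x + (2.0 ** alpha) * x ** (1.0 + alpha))
    return 2.0 * x - 1.0


def shadowing_quality(x: float, candidates, alpha: float, n: int) -> float:
    """Upper bound on Z_A(x): minimum over finitely many candidates y in A of
    (1/n) * sum_{j=0..n-1} |T^j(x) - T^j(y)|."""
    def orbit(x0):
        points, current = [], x0
        for _ in range(n):
            points.append(current)
            current = intermittent_map(current, alpha)
        return points

    reference = orbit(x)
    best = min(
        sum(abs(a - b) for a, b in zip(reference, orbit(y)))
        for y in candidates
    )
    return best / n
```

Refining the list of candidates can only decrease the returned value, so it approaches the infimum from above.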

Theorem 4.5. Let A be a subset of positive measure. Then, for all n > 1, 
for all t > 0, one has 

,({..[0,il:^.W>;;r(* + 5J)})<4;;' 

We can also look at the number of mismatches at a given precision: for
$\epsilon > 0$, let

\[ Z'_{A,\epsilon}(x) = \frac{1}{n} \inf_{y \in A} \mathrm{Card}\{ 0 \le j \le n-1 : |T^j(x) - T^j(y)| > \epsilon \}. \]

Theorem 4.6. Let A be a subset of positive measure. Then, for all n > 1, 
for all t > 0, for any e > 0, one has 

,({.e|0.l]:2U-)>4T(< + gJ)})<3;|^. 

Theorem 4.5 is a direct application of Theorem 7.1 in [6] whereas Theorem
4.6 is a direct application of Theorem 7.2 in [6].



5 Proof of Theorem 3.1 



5.1 First telescoping 

Let $(X_n)_{n \in \mathbb{N}_0}$ be the stationary process where $X_0$ is distributed according to
$\mu$ and $X_i = T(X_{i-1})$ for $i \ge 1$. The expectation in this process is denoted by
$E$. We abbreviate $X_i^j := (X_i, X_{i+1}, \ldots, X_j)$ for $i \le j$. We denote by $\mathcal{F}_i^n$ the
sigma-field generated by $X_i, X_{i+1}, \ldots, X_n$ for $i \le n$ and by convention $\mathcal{F}_{n+1}^n =
\{\emptyset, [0, 1]\}$, the trivial sigma-field. We then have the following telescoping
identity (martingale difference decomposition):

\[ K(X_0, \ldots, X_{n-1}) - E\big(K(X_0, \ldots, X_{n-1})\big) = \sum_{i=1}^{n} \Big[ E\big(K \mid \mathcal{F}_{i-1}^{n-1}\big) - E\big(K \mid \mathcal{F}_i^{n-1}\big) \Big] =: \sum_{i=1}^{n} V_i. \]



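The measurable function $E(K \mid \mathcal{F}_{i-1}^{n-1})$ is a function of $X_{i-1}, \ldots, X_{n-1}$ only.
When evaluated along an orbit segment $T_{i-1}^{n-1}(x)$, it takes the value

\[ E\big(K \mid \mathcal{F}_{i-1}^{n-1}\big)(T_{i-1}^{n-1}(x)) = E\big( K(X_0, \ldots, X_{n-1}) \mid X_{i-1}^{n-1} = T_{i-1}^{n-1}(x) \big). \]

To obtain the second equality, notice that the reversed process $(X_{n-i})_{i=0,\ldots,n}$
is a Markov chain with transition probability kernel

\[ P(X_0 = dy \mid X_1 = x) = \sum_{z : T(z) = x} \frac{h(z)}{h(x) |T'(z)|} \, \delta(y - z) \, dy \tag{7} \]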



and similarly 



The identity (7) follows at once from Bayes' formula and the identity $P(X_1 =
x \mid X_0 = z) = \delta(x - T(z))$.

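Since $\mathcal{F}_i^{n-1} \subset \mathcal{F}_j^{n-1}$ for $i > j$, we have the orthogonality property

\[ E(V_i V_j) = 0 \quad \text{for } i \ne j, \]

and hence

\[ E\big(K - E(K)\big)^2 = \sum_{i=1}^{n} E(V_i^2). \]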



The function $V_i$ is $\mathcal{F}_{i-1}^{n-1}$-measurable and

\[ V_i(T_{i-1}^{n-1}(x)) = E\big(K \mid X_{i-1}^{n-1} = T_{i-1}^{n-1}(x)\big) - E\big(K \mid X_i^{n-1} = T_i^{n-1}(x)\big). \]

Hence, by the Cauchy-Schwarz inequality,

V^{T^-\x)) < J P(Xi_i = t/x'IXf-i = T^-\x)) 



X [E {K\Xl-,' = Tr\x)) -E{K\Xl-,' = {x',Tr\x 
h{x') 
h{T{x))\T'{x')\ 



x':T(x')=T(x) 
For $x, x'$ such that $T(x) = T(x')$, let

Mi{x,x') := E {K\Xi = x) -E {K\Xi = x') 

and fil denote the conditional distribution of Xq, . . . , Xj_i given that Xj = x. 
By using the Lipschitz property of K one gets 



\Mi{x,x')\<2Up,{K) + 



and one obtains 



x':T{x')=T{x) 



^(^0 M^) ^2(^ n 

h{T{x))\T'{x')\ ' ^ 



h{x') dfi{x) 
h{T{x)) \T'{x')\ 



X 



x':T{x')=T{x) 



Let us further abbreviate 



Ux,x') := j K{4-\0,Tr"\x)) - dfil,{4-')). (8) 

We then obtain 

n 

E{K-EKf < 8 5^(Lip,(ir))2 

i=l 

n „ 

i=l x':T(x')=T(x) 

(Observe that Tk{x,x) = 0.) 



h{x'] 



h{T{x))\T'{x')\ 



-—Tt{x,x').{9) 



5.2 Second telescoping 

Our aim is now to further estimate the quantity $\Gamma_k(x, x')$ by using a second
telescoping where the decay of correlations (2) can be used.
Let

\[ \Psi_k(x) := E\big( K(X_0, \ldots, X_{k-1}, 0, T_1^{n-k-1}(x)) \mid X_k = x \big). \]

With this notation (8) reads

\[ \Gamma_k(x, x') = \Psi_k(x) - \Psi_k(x'). \tag{10} \]

The idea is now to telescope the $\Psi_k$'s by introducing an independent copy
$(Y_j)_{j \in \mathbb{N}_0}$ of the process $(X_j)_{j \in \mathbb{N}_0}$. We write

k 

^,{x) = 5^E[ir(Xo^"^r/-^o,T^'=-l(x))-ir(Xo^-^r/^,^o,T^'=-l(x))|x, 
p=i 

+E {KiYo,...,Yk^uO,Tr'-\x))) 

where now E denotes expectation both with respect to the random variables 
X and Y, and where we make the convention that, if (resp. Xj) occurs 
with j < i, then Y (resp. X) is simply not present. 
Combining ( |TOl) and ( ITTi) . we obtain 

p=i 

where /x^"^''^"^"'"^ is the conditional distribution of Xq, . . . , Xp _i given Xji- — 
and where 

u,^,{zl\Tr'-\x)) := 

E(K(4-\r/-\o,Tr^-i(x)) -K(4-^F/_l^o,T^'=-l(x))) , (12) 

where the expectation is taken with respect to Y . Observe that 

\uj,^^{zl-\Tr'-\x))\<U^^{K). 
We now define the distance 




Without loss of generality, we assume inf ^ Lip^ > 0. Hence, equipped 
with the distance dp, [0, l]^"*"^ is a complete, separable, metric space. From 
(|T2l) it follows that 

\ujp_,{zi\xr'-')-ujp-i{zi-\xr'-')\ ^ . 

i.e., for each fixed x, the function Zq~^ ^— ijjp^i{zQ~^ ,Tl'~'^~^{x)) is Lipschitz 
with respect to the distance, with Lipschitz norm less than or equal to 
one. 

Denote by c^'^,(2;q, Zq) the Kantorovich- Rubinstein coupling, associated 
with the distance dp, of the measures /ij.'"' and /i^;^ (cf [T2', Theorem 11.8.2, 
p. 421]). 
For this coupling we thus have



dp{zQ, Zq) dd^' , ^{zq, Zq) 



sup ( [fdf^l''-^- [fdf^l','-^). 

/:Lip. (/)<1 ^ J J ^ 



Hence, by the definition of the distance $d_p$ and the Kantorovich-Rubinstein
duality theorem [12], one gets



J2 I ^A4,Tr'~\^)) {d^.'f-'{4)-dnf-''{zl)) 

I [u,{zlTr'-\x)) - u,{zlTr'-\x))\ dc^l7'{zlzl) 



p=0 

< Y.I d,{zizi)de;y{zizi) 

k-l 



J2[ sup ( / / d/.^/-^ - / / d^^l'r') 



(13) 



In order to estimate F^, we will now exploit the fact that for k—j) "large" 
the measure is "close" to the invariant measure [i. More precisely, 

passing from yPj^ to involves k—j) iterations of the normalised Perron- 

Frobenius operator. 

5.3 Distortion and correlation estimates 

We now proceed by estimating the final expression in (13).
Define, as usual, the normalised Perron-Frobenius operator

\[ \mathcal{L} w(x) = E\big( w(X_0) \mid X_1 = x \big) = \int w(y) \, P(X_0 = dy \mid X_1 = x) = \sum_{u : T(u) = x} \frac{h(u)}{h(x) |T'(u)|} \, w(u). \]

By the Markov property of the reversed process we have

\[ \mathcal{L}^k w(x) = E\big( w(X_0) \mid X_k = x \big) = \int w(y) \, P(X_0 = dy \mid X_k = x). \]
For $f$ a function of $(p + 1)$ variables, define



^ ^ h{x)\{Tpy{u)\ ^^^0^"))- 

We then have 

The next three lemmas will be useful. 

Lemma 5.1. Let f be such that Lip^^(/) < 1. Then, for any y,y & le and 

any ■m> 0, we have 

\{^^UM - {^""umi < c, (14) 

y 

where 

c,=o(i)V-i;iEi±iW . 

jzj (p-J + l)''- 

Proof. Observe that it is enough to prove the lemma in the case where / van- 
ishes at some point. The general case follows by adding a constant. Without 
loss of generality, we can assume that =Sf"*/p)(y) < =5f"*/p)(y). Indeed, the 
opposite case would lead to the same estimate because there exists a constant 
C > such that y/y < C, for all y.y E h and all I. 

Since / vanishes at some point and Lip^ (/) < 1, we have |/p(Tq (•))|/Lipp_,_;^(i^) < 
2. We also have |jSf"'/p(r^(-))|/Lipp+i(xf < 2. Now we use the inequality 

^ ^ 3(a-6) ^ l + g 



5 - 1 + 6 

for all a, h such that — 2/3 < 6 < a < 2/3. Therefore, 



3Lipp+i(/s:) 3Lipp+i(i^) 



^ 5 / (^"'/p)(y) + 3Lip,+i(i^) 
- 3 U^™/,)(y) + 3Lip^^,(i^) ^ ^ 
We have

(^-/,)(l/) + 3Lip^+i(K) ^ h{y) h{z) T^P+-^y{~z) f{n{z))+?,Uv^_,,{K) 
(^-/p)(y) + 3Lip^+i(A0 - Ky) 'Jf T(p+-)'(z) f{T^,{~z)) + ?>U^^^,{K) 

(16) 

where the supremum is taken over the pairs $(z, \tilde z)$ of pre-images of $y$ and $\tilde y$
whose iterates lie in the same atoms of the Markov partition until $p + m$. To
estimate this, we use the bounds



Ky) < ^ I f^ \y-y\ 
h{y) ~ y 



proved in [13j : the first one follows from the fact that h belongs to the space 
Q [131 P- 502] whereas the second one is [131 Proposition 2.3 (ii) ]. We also 
use the bounds 



\z- z\ ^ ^ \y-y\ 



y 



(17) 



and 



\T^ (z) - {z)\ < 



C 



\TP{z) - TP{z) 



(j9-j + l)i/« TP{z) 

which are proved in the appendix (Lemmas 5.6 and 5.7).
Therefore, using (18), we get

/(ro^(^))+3Lip^(ir) ^ , , ,^,,,\fmz))-fmm 



f{T^{z)) + 3Up^^,{K) 



< 1 + 0(1] 



LiPp+i(^) 



< 1 



0(1) ^ Lip^-+i(i^) \TP{z)-TP{~z) 
LiPp+iW^(p-J + l)'/" 



TP{z) 



Using ffTTj) and all the previous bounds in ffTUj) . we obtain 



(^-/,) {y) + 3Lip,+i (^) < 1 + 0(1) \y-y\ 



(=^-/,)(^)+3Lip^+i(K) 



LiPp+i(^) y 



j=Q 



This inequality together with ffT^ completes the proof of the lemma. □ 
Lemma 5.2. Let f be such that Lip^ (/) < 1. Then for any q > we have 



djj < Dp 7g 



l -a 
3 



where 



p ■ 
Proof. Let $M > 0$ be an integer and $\epsilon > 0$ to be fixed later on. Recall the
notation $I_\ell = ]x_{\ell+1}, x_\ell]$. For $\ell \le M$, we define the sequence of functions
each vanishing outside $I_\ell$, given by



fp{x) for X e [xi+i + e\Ie\,xe - e\Ie\] 

. llS/p(^^^^l^^l) X E[xe- e\Ie\,xe]. 

We have the identity 



The decay of correlations (2) gives us



y (^/; - y fi di^j rf/i = ^^sup^^ y w (^/,^ - y d^^j 



< 



ll«llci: 

Cri 



7g, 



since |/p| < Cp and using Lemma F5. II with m = 0. 
On the other hand, we have 



/ 



M 



M 



e=o 



e=M+i 



The optimal bound is obtained with 

e = 7|m^+^ , M = 7g" 

The Lemma follows. 



□ 



Lemma 5.3. Let f be such that Lipj^^(/) < 1. Then for any q > and i > 
we have 



< A(£,g;/,) 
where

L 1 9q,u \dx if Cp\Ii\'^<xi J I ggj, I dx 



<^pflf \9q,fp\<ix 



2\ — '■ otherwise 

Xl 



where 



Proof. By ffT^ we have 

\9qjM - 9g,u{y')\ < Cp ^— ^ 

for y, y' G h. 

Hence, if we let J ^ h and y & J, then we have 

\9gjM\ \3<iJp^y') \ '^y' ^^i^f^^y") ~ 9qJ,{y')\ dy' < 



The first case follows by taking J = I^. In the second case, we take J such 
that 

1^1 = ^^ jj9,,u\dx<\h\. 

The lemma is proved. □ 
Now return to ([9]). We have to estimate 



where 



m=l'^'f™ x':x^x',Tix)=T{x') ^ ^ ^ 

and 

where the intervals $I_m$ form the Markov partition defined in Subsection 2.1.
We have the following lemmas.
Lemma 5.4. Let

oo 

Qk-=y^\h\ sup TI{x,x'). 

xeIe,x'eIo:T{x)=T(x') 

Then there exists a constant $B > 0$ such that for any $k$

\[ S_1(k) \le B \, Q_k \]

and

\[ S_2(k) \le B \, Q_k. \]

Proof. We first observe that, if $m \ge 1$, $x \in I_m$, $T(x) = T(x')$, and $x \ne x'$,
then $x' \in I_0$. Next, using [13, Lemma 4.4 (iv)] and the fact that $h$ is bounded
on $I_0$, we get

^ h(x')h(x) 

G := sup sup sup . . ... . < oo . 

The bound on Si{k) follows immediately. 
For the bound on 5*2 (fc), we have 



h{x') Tl{x,x') 
h{T{x)) \T'ix')\ 



Note that the term corresponding to $m = 0$ is absent because $x \ne x'$. Observe
that

, h{x')h{x) 
G := sup sup sup , , ^ ^ |„ . — rr < oo 

m>i x'eim xeio,Tix)=T{x') h{T{x))\T'{x') \ 

and there is a constant C > such that for any m > 1 

|{xG/o|r(x) Gr(/j}| <c 

The lemma follows. □ 



Lemma 5.5. Assume that $\alpha \in [0, 4 - \sqrt{15}[$. Then there exists a constant
$H > 0$ such that

n n 
k=l 3=1 

Proof. Observe that 

sup \Tk{x,x)\ < 

x&Io,x'&I m 

k-l 

Y] sup A(0, k-pjp) + y2 sup A(m, k - p, fp) 

p=o f--^''Pdj,{f)<^ p=o f'-^'Pipim^ 
= S(0,k) + S(m, k)

where 



k-l 



^{j,f^)-=Y] sup A{j,k-pjp) 

p=0 /:LiPdp{/)<l 

for j > 0. 

By Lemma 5.2,



sup / < 7^'-"^/^ 



Both cases of Lemma 5.3 lead to the bound



sup A(0,A;-p,/,)<O(l)Cp7r 

/:Lip,^(/)<l 



(l-a)/6 



Since a G [0,4 — vT5[, we have 
and Young's inequahty yields 

n n 
k=l m p j=l 

We now bound 



By Lemma 5.3 we get

Y,\h\ m,kf <A^{k) + A2{k) 



where 



and 

A2(fc) :=8 V|Jf| fy'^ sup / \gk-pj^{x)\dx 
Observe that

f 0(1) 

\9k-p,u{x)\dx < -J- \gk-pj,W < ^k-P 

since $h|_{I_\ell} \sim \ell$ and by using Lemma 5.2. Hence,

k-l - 2 

l-Q 



which implies, as above,



vP=0 



2 



k=i j=i 
We now bound $A_2(k)$. By the Cauchy-Schwarz inequality, for any $\delta > 0$, we have



A,ik)<Oil)J2\Ii\J2TTr2 { f \9k-P,u{x)\dx\ (k-p) 

Observe that, if Lipj^^(/) < 1, then 

ll^gjplU- < Lipp+i(ir). 

Indeed, 



l+<5 



and we use the fact that $\mathrm{Lip}_{d_p}(f) \le 1$ and that $\mathcal{L}$ has $L^\infty$-norm equal to
one. This implies that, for any $0 < \sigma < 2$,

A2{k) < 0(l)x 

, tTa /:Lip. {/)<1 \Jh / 



-_0 /^LiPdp{/)<l \Jh 

Using again h\j^ ~ i, Lemma [521 and the fact that Lipp(_ft') < 0{l)Cp, we 
get 



A,ik) < Oil) E ^ E ^g^^ - p) 

I p=0 
1+5



Since a G [0, 4 — v 15[, there exist < a < 1 and 6 > such that 
then, 

n n 

k=i j=i 
This ends the proof of Lemma 5.5. □

5.4 End of the proof 

We now conclude the proof of Theorem 3.1. By (9) and (19), we have



E{K-EKf < 8^(Lip,(ir))^ 



i=l 



+2^2 [ d^x) Y 

i=l x':T(x')=T(x 



ix')=T{x) ^ ^ ^ ^' 



k k 

The theorem now follows from Lemmas 5.4 and 5.5.



Appendix 

In this appendix we prove the inequalities (17) and (18) used in the proof of
Lemma 5.1. We recall that the map $T$ is defined in Section 2.

Lemma 5.6. There exists a constant $C > 0$ such that, for any integer $m \ge 1$
and any pair of points $z, \tilde z$ such that, for $0 \le j \le m$, $T^j(z)$ and $T^j(\tilde z)$ belong
to the same atom of the Markov partition, one has

\[ \frac{|z - \tilde z|}{z} \le C \, \frac{|T^m(z) - T^m(\tilde z)|}{T^m(z)}. \]
Proof. We start by proving the following inequality:

\[ |(T^m)'(z)| \ge C_0 \Big( \frac{T^m(z)}{z} \Big)^{1+\alpha} \tag{20} \]

where $C_0 > 0$ is independent of $m$ and $z$.
There are two cases.

If $z \ge 1/2$, the inequality is true provided that $C_0 \le 2^{-(1+\alpha)}$.
Now consider the case $z < 1/2$. We define an integer $q \le m$ as follows. If
$T^j(z) < 1/2$ for $j = 0, 1, \ldots, m - 1$ then we take $q = m$. Otherwise $q$ is the
smallest integer such that $T^q(z) \ge 1/2$. Since $z < 1/2$, there is an integer
$\ell \ge 1$ such that $z \in I_\ell$. Moreover $T^q$ is a diffeomorphism from $I_\ell$ to $I_{\ell-q}$.
From the distortion lemma, see e.g. [13, Proposition 2.3], we get



e-q 



where $C_1 > 0$ is independent of $q$ and $z$. From (1) it follows that

1+a 

/ / I 7' I \ 

IT*? {z)\>C2 



T%z) 



where $C_2 > 0$ is independent of $q$ and $z$. If $q = m$ then (20) is proved with
$C_0 = \min(C_2, 2^{-(1+\alpha)})$. If $q < m$ then we observe that



|T'"'(z)| = \T^"'-^y{T''{z))\\T'^'{z)\ > C2 



z 

because $|(T^{m-q})'| \ge 1$. Since $T^q(z) \ge 1/2$, we obtain

C2 1 C2 



1+a 



2i+« ^i+a — 2-'-+'^ 

This finishes the proof of inequality (20).

To prove the lemma, we first observe that if T"^{z) < z then 

\z-z\< \T-\z) - T-i~z)\ < |r™(z) - T"(5)| 

because the modulus of the T' is larger than or equal to one. The remaining 
case is when T"^{z) > z. We observe that 

1+0 



where we used again the distortion estimates ([131 Proposition 2.3]), ([T]), 
the monotonicity of T™, and where C > is independent m, z. This 
immediately implies 

\z-z\ f z Y iT'^iz) -T'^iS) 



C yT'^iz) J T'^z 
The Lemma is proved. □ 
Lemma 5.7. There exists a constant $C > 0$ such that, for any integer $m \ge 1$
and any pair of points $z, \tilde z$ such that, for $0 \le j \le m$, $T^j(z)$ and $T^j(\tilde z)$ belong
to the same atom of the Markov partition, one has



(m+l)V" T™(z) 

Proof. Observe that if T"^{z) < ^^^^^i/c then the estimate follows at once 
from the fact that the modulus of the derivative of T is larger than or equal to 
one. So we now assume that T'^{z) > ^^^^^^^ . Let £ > be the integer such 
that T"^z G l£. There exists a unique z^, in J^+m such that T"*(2^,) = T"^{z). 
Since z^, is the closest T~'"-preimage of T"^{z) to the neutral fixed point 0, 
one can easily show that there is a constant c > such that, for any m and 
z, one has |T'"'(z)| > c|T'"'(z,)|. 

As in the proof of the previous lemma, we use the distortion estimates 
([IS Proposition 2.3]) and ([1]) to obtain 

\T-\z)\>c'^^±^ 

From the distortion estimates ([131 Proposition 2.3]) we get, using that 
T-(^) e le, 

|r™(z) - T™(5)| > 0(l)|T'"'(z)| \z-z\> 0{l)\T^{z)\ ^^^7^'^" \z - z\. 



i 



This can be rewritten as 





|T™(z) -r'"(5)| 


(£ + m)i+i 







;i + m)^ |T-'(^)I 
The proof of the lemma is complete. □ 